Abstract

In this paper we study noisy sorting without re-sampling. In this problem there is an unknown order
$a_{\pi(1)} < \dots < a_{\pi(n)}$, where $\pi$ is a permutation on $n$ elements. The input is the status of $\binom{n}{2}$ queries
of the form $q(a_i, a_j)$, where $q(a_i, a_j) = +$ with probability at least $1/2 + \gamma$ if $\pi(i) > \pi(j)$, for all
pairs $i \neq j$, where $\gamma > 0$ is a constant and $q(a_i, a_j) = -q(a_j, a_i)$ for all $i$ and $j$. It is assumed that
the errors are independent. Given the status of the queries, the goal is to find the maximum likelihood
order. In other words, the goal is to find a permutation $\sigma$ that minimizes the number of pairs $\sigma(i) > \sigma(j)$
where $q(\sigma(i), \sigma(j)) = -$. The problem so defined is the feedback arc set problem on distributions of
inputs, each of which is a tournament obtained as a noisy perturbation of a linear order. Note that when
$\gamma < 1/2$ and $n$ is large, it is impossible to recover the original order $\pi$.

It is known that the weighted feedback arc set problem on tournaments is NP-hard in general.
Here we present an algorithm of running time $n^{O(\gamma^{-4})}$ and sampling complexity $O_\gamma(n \log n)$ that
with high probability solves the noisy sorting without re-sampling problem. We also show that if
$a_{\sigma(1)}, a_{\sigma(2)}, \dots, a_{\sigma(n)}$ is an optimal solution of the problem then it is "close" to the original order. More
formally, with high probability it holds that $\sum_i |\sigma(i) - \pi(i)| = \Theta(n)$ and $\max_i |\sigma(i) - \pi(i)| = O(\log n)$.

Our results are of interest in applications to ranking, such as ranking in sports, or ranking of search 
items based on comparisons by experts. 
1 Introduction



We study the problem of sorting in the presence of noise. While sorting linear orders is a classical well 
studied problem, the introduction of noise poses very interesting challenges. Noise has to be considered 
when ranking or sorting is applied in many real life scenarios. 

A natural example comes from sports. How do we rank a league of soccer teams based on the outcome
of the games? It is natural to assume that there is a true underlying order of which team is better and that
the game outcomes represent noisy versions of the pairwise comparisons between teams. Note that in this
problem it is impossible to "re-sample" the order between a pair of teams. As a second example, consider
experts comparing various items according to their importance, where each pair of elements is compared by
one expert. It is natural to assume that the experts' opinions represent a noisy view of the actual order of
significance. The question is then how to aggregate this information.

1.1 The Sorting Model 

We will consider the following probabilistic model of instances. There will be $n$ items denoted $a_1, \dots, a_n$.
There will be a true order given by a permutation $\pi$ on $n$ elements such that under the true order
$a_{\pi(1)} < a_{\pi(2)} < \dots < a_{\pi(n-1)} < a_{\pi(n)}$. The algorithm will have access to $\binom{n}{2}$ queries defined as follows.

Definition 1. For each pair $i \neq j$ the outcome of the comparison between $a_i$ and $a_j$ is denoted by $q(a_i, a_j) \in \{\pm\}$,
where for all $i \neq j$ it holds that $q(a_i, a_j) = -q(a_j, a_i)$. We assume that the probability that $q(a_i, a_j) = +$ is at
least $p := \frac{1}{2} + \gamma$ if $\pi(i) > \pi(j)$ and that the queries

$\{q(a_i, a_j) : 1 \le i < j \le n\}$

are independent conditioned on the true order. In other words, for any set

$S = \{(i(1) < j(1)), \dots, (i(k) < j(k))\}$,

any vector $s \in \{\pm\}^k$ and any $(i < j) \notin S$ it holds that

$P[q(a_i, a_j) = + \mid \forall\, 1 \le \ell \le k : q(a_{i(\ell)}, a_{j(\ell)}) = s_\ell] = P[q(a_i, a_j) = +]$. (1)
It is further assumed that $1/2 < p = \frac{1}{2} + \gamma < 1$.
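
The query model of Definition 1 can be simulated with a few lines of Python. This is only an illustrative sketch: the function name `generate_queries`, the dictionary representation of $q$, and the choice of answering each query correctly with probability exactly $1/2 + \gamma$ (the definition only requires "at least") are assumptions made here, not part of the paper.

```python
import random

def generate_queries(pi, gamma, rng):
    """Return q as a dict with q[(i, j)] in {+1, -1} for all i != j.

    pi[i] is the rank of item a_i in the true order (larger = bigger).
    Each pair is answered correctly with probability exactly 1/2 + gamma,
    independently of the other pairs, and q[(j, i)] = -q[(i, j)].
    """
    n = len(pi)
    q = {}
    for i in range(n):
        for j in range(i + 1, n):
            truth = 1 if pi[i] > pi[j] else -1
            answer = truth if rng.random() < 0.5 + gamma else -truth
            q[(i, j)] = answer
            q[(j, i)] = -answer
    return q

# Hypothetical example: four items, noise parameter gamma = 0.3.
rng = random.Random(0)
q = generate_queries([2, 0, 3, 1], gamma=0.3, rng=rng)
```

Each unordered pair is sampled once and then mirrored, which mirrors the "no re-sampling" constraint: asking about the same pair again returns the same answer.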

We will be interested in finding a ranking that will minimize the number of upsets. More formally:
Definition 2. Given $\binom{n}{2}$ queries $q(a_i, a_j)$ the score $s_q(\sigma)$ of a ranking (permutation) $\sigma$ is given by

$s_q(\sigma) = \sum_{i,j : \sigma(i) > \sigma(j)} q(a_{\sigma(i)}, a_{\sigma(j)})$. (2)

We say that a ranking $\tau$ is optimal for $q$ if $\tau$ is a maximizer of (2) among all rankings.

The Noisy Sorting Without Resampling (NSWR) problem is the problem of finding an optimal $\tau$ given
$q$ assuming that $q$ is generated as in Definition 1.
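
To make the objective concrete, the following sketch computes a score of the kind used in Definition 2 and finds a maximizer by exhaustive search. The representation is an assumption made for illustration: a ranking is a list of item indices from smallest to largest, `q[(x, y)] = +1` means the query reported $a_x > a_y$, and each pair contributes $+1$ when the query agrees with the ranking and $-1$ when it is an upset. The brute-force search is exponential and is shown only to pin down what "optimal" means.

```python
from itertools import permutations

def score(q, order):
    """Score of a ranking given as a list of item indices, smallest first.

    For every pair where `hi` is ranked above `lo`, add q[(hi, lo)]:
    +1 when the query agrees with the ranking, -1 when it is an upset.
    """
    total = 0
    for pos_hi in range(len(order)):
        for pos_lo in range(pos_hi):
            total += q[(order[pos_hi], order[pos_lo])]
    return total

def brute_force_optimal(q, n):
    """Exhaustive maximizer of the score; only feasible for very small n."""
    return max(permutations(range(n)), key=lambda order: score(q, order))

# Noiseless queries for the true order a_1 < a_0 < a_2.
rank = [1, 0, 2]
q = {(i, j): (1 if rank[i] > rank[j] else -1)
     for i in range(3) for j in range(3) if i != j}
best = brute_force_optimal(q, 3)
```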

The problem of maximizing (2) without any assumptions on the input distribution is called the feedback
arc set problem for tournaments, which is known to be NP-hard; see subsection 1.2 for references, more
background and related models.
The score (2) has a clear statistical interpretation in the case where each query is answered correctly
with probability $p$ exactly. In this case, for each permutation $\sigma$ we can calculate the probability $P[q \mid \sigma]$ of
observing $q$ given that $\sigma$ is the true order. It is immediate to verify that $\log P[q \mid \sigma] = a\, s_q(\sigma) + b$ for two
constants $a > 0$, $b$. Thus in this case the optimal solution to the NSWR problem is identical with the
maximum likelihood order that is consistent with $q$. This in particular implies that given a prior uniform
distribution on the $n!$ rankings, any order $\sigma$ maximizing (2) is also a maximizer of the posterior probability
given $q$. So by analogy to problems in coding theory, see e.g. [7], $\sigma$ is a maximum likelihood decoding of
the original order $\pi$.
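
The affine relation between the log-likelihood and the score can be written out explicitly. The derivation below is a sketch under two assumptions that are not spelled out in the text: the score counts every unordered pair once, with $+1$ for a query that agrees with $\sigma$ and $-1$ otherwise, and $A(\sigma)$ (notation introduced here) denotes the number of queries that agree with $\sigma$.

```latex
\begin{align*}
P[q \mid \sigma] &= p^{A(\sigma)} (1-p)^{N - A(\sigma)}, \qquad N = \binom{n}{2},\\
s_q(\sigma) &= A(\sigma) - \bigl(N - A(\sigma)\bigr) = 2A(\sigma) - N,\\
\log P[q \mid \sigma] &= A(\sigma)\log p + \bigl(N - A(\sigma)\bigr)\log(1-p)\\
 &= \underbrace{\tfrac{1}{2}\log\tfrac{p}{1-p}}_{a}\; s_q(\sigma)
  \;+\; \underbrace{\tfrac{N}{2}\log\bigl(p(1-p)\bigr)}_{b}.
\end{align*}
```

Under these assumptions $a > 0$ exactly because $p > 1/2$, so maximizing the score and maximizing the likelihood select the same orders.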

Note furthermore that one should not expect to be able to find the true order if q is noisy. Indeed for any 
pair of adjacent elements we are only given one noisy bit to determine which of the two is bigger. 



1.2 Related Sorting Models and Results 

It is natural to consider the problem of finding a ranking $\sigma$ that minimizes the score $s_q(\sigma)$ without making
any assumptions on the input $q$. This problem, called the feedback arc set problem for tournaments, is known
to be NP hard II1I3- However, it does admit PTAS [61 achieving a (1 + e) approximation for 




in time that is polynomial in $n$ and doubly exponential in $1/\epsilon$. The results of [6] are the latest in a long line
of work starting in the 1960's and including |[T]|2l. See IS for a detailed history of the feedback arc set 
problem. 

A problem that is in a sense easier than NSWR is the problem where repetitions are allowed in querying.
In this case it is easy to observe that the original order may be recovered in $O(n \log^2 n)$ queries with high
probability. Indeed, one may perform any of the standard $O(n \log n)$ sorting algorithms and repeat each
query $O(\log n)$ times in order to obtain the actual order between the queried elements with error probability
(say). More sophisticated methods allow to show that in fact the true order may be found in query 
complexity $O(n \log n)$ with high probability [4], see also [5].



1.3 Main Results 

In our main results we show that the NSWR problem is solvable in polynomial time with high probability 
and that any optimal order is close to the true order. More formally we show that 

Theorem 3. There exists a randomized algorithm that for any $\gamma > 0$ and $\beta > 0$ finds an optimal solution
to the noisy sorting without resampling (NSWR) problem in time $n^{O((\beta+1)\gamma^{-4})}$ except with probability $n^{-\beta}$.

Theorem 4. Consider the NSWR problem and let $\pi$ be the true order and $\sigma$ be any optimal order. Then except
with probability $O(n^{-\beta})$ it holds that

$\sum_{i=1}^{n} |\sigma(i) - \pi(i)| = O(n)$, (3)

$\max_i |\sigma(i) - \pi(i)| = O(\log n)$. (4)

Utilizing some of the techniques of [4] it is possible to obtain the results of Theorem 3 with low sampling
complexity. More formally,
Theorem 5. There is an implementation of a sorting algorithm with the same guarantees as in Theorem 3
and whose sampling complexity is $C n \log n$ where $C = C(\beta, \gamma)$.

It should be noted that the proofs can be modified to a more general case where the conditional probability
from (1) is always bounded from below by $p$ without necessarily being independent.

1.4 Techniques 

In order to obtain a polynomial time algorithm for the NSWR problem it is important to identify that any
optimal solution to the problem is close to the true one. Thus the main step of the analysis is the proof of
Theorem 4.

To find an efficient sorting we use an insertion algorithm. Given an optimal order on a subset of the items
we show how to insert a new element. Since the optimal order both before and after the insertion of the
element has to satisfy Theorem 4, it is also the case that no element moves more than $O(\log n)$ after the
insertion and re-sorting. Using this and a dynamic programming approach we derive an insertion algorithm
in Section 2. The results of this section may be of independent interest in cases where it is known that a
single element insertion into an optimal suborder cannot result in a new optimal order where some elements
moved by much.

The main task is to prove Theorem 4 in Section 3. We first prove (3) by showing that for a large enough
constant $c$, it is unlikely that any order $\sigma$ whose total distance is more than $cn$ will have $s_q(\sigma) \ge s_q(\pi)$,
where $\pi$ is the original order. We then establish (4) in subsection 3.2 using a bootstrap argument. The
argument is based on the idea that if the discrepancy in the position of an element $a$ in an optimal order
compared to the true order is more than $c \log n$ for a large constant $c$, then there must exist many elements
that are "close" to $a$ that have also moved by much. This then leads to a contradiction with (3).

The final analysis of the insertion algorithm and the proof of Theorem 3 are provided in Section 4.
Section 5 shows how using a variant of the sorting algorithm it is possible to achieve polynomial running
time with sampling complexity $O(n \log n)$.

1.5 Distances between rankings 

Here we define a few measures of distance between rankings that will be used later. First, given two
permutations $\sigma$ and $\tau$ we define the dislocation distance by

$d(\sigma, \tau) = \sum_{i=1}^{n} |\sigma(i) - \tau(i)|$.

Given a ranking $\pi$ we define $q_\pi \in \{\pm\}^{\binom{[n]}{2}}$ so that $q_\pi(a_i, a_j) = +$ if $\pi(i) > \pi(j)$ and $q_\pi(a_i, a_j) = -$
otherwise. Note that using this notation $q$ is obtained from $q_\pi$ by flipping each entry independently with
probability $1 - p = 1/2 - \gamma$. Given $q, q' \in \{\pm\}^{\binom{[n]}{2}}$ we denote by

$d(q, q') = \frac{1}{2} \sum_{i<j} |q(i,j) - q'(i,j)|$

We will write $d(\sigma)$ for $d(\sigma, \mathrm{id})$ where id is the identity permutation and $d(q)$ for $d(q, q_{\mathrm{id}})$. Below we will
often use the following well known claim Q. 

Claim 6. For any $\tau$,

$\frac{1}{2} d(\tau) \le d(q_\tau) \le d(\tau)$.
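
Claim 6 can be checked exhaustively for small $n$. In the sketch below, which uses 0-based indices and helper names chosen here for illustration, `dislocation` computes $d(\tau)$ and `query_distance` computes $d(q_\tau)$, read as the number of pairs that $\tau$ orders differently from the identity.

```python
from itertools import permutations

def dislocation(tau):
    """d(tau) = sum_i |tau(i) - i| (distance to the identity)."""
    return sum(abs(t - i) for i, t in enumerate(tau))

def query_distance(tau):
    """d(q_tau): number of pairs (i, j), i < j, that tau puts in the opposite order."""
    n = len(tau)
    return sum(1 for i in range(n) for j in range(i + 1, n) if tau[i] > tau[j])

def claim6_holds(n):
    """Check (1/2) d(tau) <= d(q_tau) <= d(tau) for every permutation of size n."""
    return all(
        dislocation(tau) <= 2 * query_distance(tau) <= 2 * dislocation(tau)
        for tau in permutations(range(n))
    )

print(claim6_holds(6))
```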
2 Sorting a presorted list



In this section we prove that if a list is pre-sorted so that each element is at most $k$ positions away from its
location in the optimal ordering, then the optimal sorting can be found in time $O(n^2 \cdot 2^{6k})$.

Lemma 7. Let $a_1, a_2, \dots, a_n$ be $n$ elements together with noisy queries $q$. Suppose that we are given that
there is an optimal ordering $a_{\sigma(1)}, a_{\sigma(2)}, \dots, a_{\sigma(n)}$ such that $|\sigma(i) - i| \le k$ for all $i$. Then we can find such
an optimal $\sigma$ in time $O(n^2 \cdot 2^{6k})$.

In the applications below $k$ will be $O(\log n)$. Note that a brute force search over all possible $\sigma$ would
require time $k^{\Theta(n)}$. Instead we use dynamic programming to reduce the running time.

Proof. We use a dynamic programming technique to find an optimal sorting. In order to simplify notation
we assume that the true ranking $\pi$ is the identity ranking. In other words, $a_1 < a_2 < \dots < a_n$. Let $i < j$ be
any indices, then by the assumption, the elements in the optimally ordered interval

$I = [a_{\sigma(i)}, a_{\sigma(i+1)}, \dots, a_{\sigma(j)}]$

satisfy $I^- \subset I \subset I^+$ where

$I^- = [a_{i+k}, \dots, a_{j-k}]$, $I^+ = [a_{i-k}, \dots, a_{j+k}]$.

Hence selecting the set $S_I = \{a_{\sigma(i)}, a_{\sigma(i+1)}, \dots, a_{\sigma(j)}\}$ involves choosing a set of size $j - i + 1$ that contains
the elements of $I^-$ and is contained in $I^+$. This involves selecting $2k$ elements from the list (or from a subset
of the list)

$\{a_{i-k}, a_{i-k+1}, \dots, a_{i+k-1}, a_{j-k+1}, \dots, a_{j+k}\}$

which has $4k$ elements. Thus the number of such $S_I$'s is bounded by $2^{4k}$.

We may assume without loss of generality that $n$ is an exact power of 2. Denote by $I_0$ the interval
containing all the elements. Denote by $I_1$ the left half of $I_0$ and by $I_2$ its right half. Denote by $I_3$ the left
half of $I_1$ and so on. In total, we will have $n - 1$ intervals of lengths $2, 4, 8, \dots$.

For each $I_t = [a_i, \dots, a_j]$ let $S_t$ denote the possible ($\le 2^{4k}$) sets of the elements $I_t' = [a_{\sigma(i)}, \dots, a_{\sigma(j)}]$.
We use dynamic programming to store an optimal ordering of each such $I_t' \in S_t$. The total number of $I_t'$'s
we will have to consider is bounded by $n \cdot 2^{4k}$. We proceed from $t = n - 1$ down to $t = 0$ producing and
storing an optimal sort for each possible $I_t'$. For $t = n-1, n-2, \dots, n/2$ the length of each $I_t'$ is 2, and the
optimal sort can be found in $O(1)$ steps.

Now let $t < n/2$. We are trying to find an optimal sort of a given $I_t'$ occupying positions $[i, i+1, \dots, i+2s-1]$. We do this
by dividing the optimal sort into two halves $I_l$ and $I_r$ and trying to sort them separately. We know that $I_l$
must contain all the elements in $I_t'$ that come from the interval $[a_i, \dots, a_{i+s-1-k}]$ and must be contained in
the interval $[a_i, \dots, a_{i+s-1+k}]$. Thus there are at most $2^{2k}$ choices for the elements of $I_l$, and the choice
of $I_l$ determines $I_r$ uniquely. For each such choice we look up an optimum solution for $I_l$ and for $I_r$ in
the dynamic programming table. Among all possible choices of $I_l$ we pick the best one. This is done by
recomputing the score $s_q$ for the joined interval, and takes at most $|I_t|^2$ time. Thus the total cost will be

$\sum_{i=1}^{\log n} \#\{\text{intervals of length } 2^i\} \cdot \#\text{checks} \cdot \text{cost of check} = \sum_{i=1}^{\log n} \frac{n}{2^i} \cdot 2^{6k} \cdot 2^{2i} = O(n^2 \cdot 2^{6k})$.

□ 
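
The idea behind Lemma 7, that a displacement bound of $k$ leaves only exponentially-in-$k$ many candidate sets to track, can be illustrated with a simpler dynamic program than the interval-halving scheme of the proof. The sketch below is not the paper's algorithm: it fills positions from left to right and keeps, for each reachable set of already-placed items, the best score found so far. It assumes 0-based items, a dictionary `q` with `q[(x, y)] = +1` when the query reports $a_x > a_y$, and a score that adds `q[(hi, lo)]` for each pair with `hi` ranked above `lo`; the function name is hypothetical.

```python
def banded_optimal_order(q, n, k):
    """Best-scoring order in which item e sits at a position within k of e.

    q[(x, y)] = +1 if the query says a_x > a_y, else -1 (antisymmetric).
    State: bitmask of items already placed in positions 0..pos-1.
    Value: (best score so far, one order achieving it).
    """
    states = {0: (0, [])}
    for pos in range(n):
        nxt = {}
        for mask, (sc, order) in states.items():
            must = pos - k  # item that can no longer wait, if still unplaced
            forced = must >= 0 and not (mask >> must) & 1
            if forced:
                candidates = [must]
            else:
                candidates = range(max(0, pos - k), min(n, pos + k + 1))
            for e in candidates:
                if (mask >> e) & 1:
                    continue
                # e is ranked above every item placed so far; the gain depends
                # only on the set of placed items, not on their internal order.
                gain = sum(q[(e, f)] for f in order)
                cand = (sc + gain, order + [e])
                new_mask = mask | (1 << e)
                if new_mask not in nxt or cand[0] > nxt[new_mask][0]:
                    nxt[new_mask] = cand
        states = nxt
    return states[(1 << n) - 1]

# Noiseless queries for the identity order on 4 items.
q = {(x, y): (1 if x > y else -1) for x in range(4) for y in range(4) if x != y}
best_score, best_order = banded_optimal_order(q, 4, 1)
```

Only sets that differ from a prefix $\{0, \dots, pos-1\}$ inside a window of width $2k$ are ever reachable, which is the same counting observation the proof uses to bound the number of sets $S_I$.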
3 The Discrepancy between the true order and Optima



The goal of this section is to establish that with high probability any optimum solution will not be far from 
the original solution. We first establish that the orders are close on average, and then that they are pointwise 
close to each other. 



3.1 Average proximity 

We prove that with high probability, the total difference between the original and any optimal ordering is 
linear in the length of the interval. 

We begin by bounding the probability that a specific permutation a will beat the original ordering. 

Lemma 8. Suppose that the original ordering is $a_1 < a_2 < \dots < a_n$. Let $\sigma$ be another permutation. Then the
probability that $\sigma$ beats the identity permutation is bounded from above by

$P[\mathrm{Bin}(d(q_\sigma), 1/2 + \gamma) \le d(q_\sigma)/2] \le \exp(-2 d(q_\sigma) \gamma^2)$.

Proof. In order for $\sigma$ to beat the identity, it needs to beat it in at least half of the $d(q_\sigma)$ pairwise relations where
they differ. This proves that the probability that it beats the identity is exactly $P[\mathrm{Bin}(d(q_\sigma), 1/2 + \gamma) \le
d(q_\sigma)/2]$. The last inequality follows by a Chernoff bound. □
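
The bound of Lemma 8 can be compared numerically with the exact binomial tail. The sketch below is illustrative only; the function names are chosen here, and `d` stands for $d(q_\sigma)$.

```python
from math import comb, exp

def prob_sigma_beats_identity(d, gamma):
    """Exact P[Bin(d, 1/2 + gamma) <= d/2]."""
    p = 0.5 + gamma
    return sum(comb(d, x) * p**x * (1 - p)**(d - x) for x in range(d // 2 + 1))

def chernoff_bound(d, gamma):
    """The upper bound exp(-2 d gamma^2) from Lemma 8."""
    return exp(-2 * d * gamma**2)

for d in (10, 50, 200):
    print(d, prob_sigma_beats_identity(d, 0.1), chernoff_bound(d, 0.1))
```

Both quantities decay exponentially in $d$, which is what makes the union bound over all orders far from the identity feasible in Lemma 10.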

Lemma 9. The number of permutations $\tau$ on $[n]$ satisfying $d(\tau) \le cn$ is at most

$2^n \cdot 2^{(1+c) n H(1/(1+c))}$.

Here $H(x)$ is the binary entropy of $x$ defined by

$H(x) = -x \log_2 x - (1 - x) \log_2 (1 - x) < -2x \log_2 x$,

for small $x$.

Proof. Note that each $\tau$ can be uniquely specified by the values of $s(i) = \tau(i) - i$, and that we are given that
$\sum_i |s(i)|$ is exactly $d(\tau) \le cn$. Thus there is an injection of $\tau$'s with $d(\tau) = m$ into sequences of $n$ numbers
which in absolute values add up to $m$. It thus suffices to bound the number of such sequences. The number
of unsigned sequences equals the number of ways of placing $m$ balls in $n$ bins, which is equal to $\binom{n+m-1}{m}$.
Signs multiply the possibilities by at most $2^n$. Hence the total number of $\tau$'s with $d(\tau) = m$ is bounded by
$2^n \cdot \binom{n+m-1}{m}$. Summing up over the possible values of $m$ we obtain

$\sum_{m=0}^{cn} 2^n \binom{n+m-1}{m} \le 2^n \binom{n+cn}{cn} \le 2^n \cdot 2^{(n+cn) H(n/(n+cn))}$.

□
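
The two steps in the final display of the proof, summing the stars-and-bars counts and then applying the entropy bound on a binomial coefficient, can be checked numerically. The sketch below uses names chosen here for illustration, with `M` standing for $cn$.

```python
from math import comb, log2

def binary_entropy(x):
    """H(x) in bits, with H(0) = H(1) = 0."""
    if x in (0, 1):
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)

def summed_count(n, M):
    """sum_{m=0}^{M} C(n+m-1, m): unsigned displacement sequences with total <= M."""
    return sum(comb(n + m - 1, m) for m in range(M + 1))

def entropy_bound(n, M):
    """2^{(n+M) H(n/(n+M))}, an upper bound on C(n+M, M)."""
    return 2 ** ((n + M) * binary_entropy(n / (n + M)))

n, c = 20, 2
M = c * n
print(summed_count(n, M) == comb(n + M, M), comb(n + M, M) <= entropy_bound(n, M))
```

The first comparison is in fact an identity (the hockey-stick identity), so the first inequality in the display holds with equality.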



Lemma 10. Suppose that the true ordering is $a_1 < \dots < a_n$ and $n$ is large enough. Then if $c \ge 1$ and

$c\gamma^2 \ge 1 + (1+c) H(1/(1+c))$,

the probability that any ranking $\sigma$ is optimal and $d(\sigma) > cn$ is at most $\exp(-cn\gamma^2/10)$ for sufficiently large
$n$. In particular, as $\gamma \to 0$, it suffices to take

$c = O(-\gamma^{-2} \log \gamma) = \tilde{O}(\gamma^{-2})$.

Proof. Let $\sigma$ be an ordering with $d(\sigma) > cn$. Then by Claim 6 we have $d(q_\sigma) \ge cn/2$. Therefore the
probability that such an ordering will beat the identity is bounded by $\exp(-cn\gamma^2)$ by Lemma 8. We now
use the union bound and Lemma 9 to obtain the desired result. □
3.2 Pointwise proximity

In the previous section we have seen that it is unlikely that the average element in the optimal order is
more than a constant number of positions away from its original location. Our next goal is to show that
the maximum dislocation of an element is bounded by $O(\log n)$. As a first step, we show that one "big"
dislocation is likely to entail many "big" dislocations.

Lemma 11. Suppose that the true ordering of $a_1, \dots, a_n$ is given by the identity ranking, i.e., $a_1 < a_2 < \dots <
a_n$. Let $1 \le i < j \le n$ be two indices and $m = j - i$. Let $A_{ij}$ be the event that there is an optimum ordering
$\sigma$ such that $\sigma(i) = j$ and

$|(\sigma[1, i-\ell-1] \cup \sigma[j+\ell+1, n]) \cap [i, j-1]| \le \ell$,

i.e., at most $\ell$ elements are mapped to the interval $[i, j-1]$ from outside the interval $[i-\ell, j+\ell]$ by $\sigma$, where
I = |_|7?TiJ ■ Then 

$P(A_{ij}) < p_1^m$,

where $p_1 = \exp(-\gamma^2/16) < 1$.

Proof. The assumption that $\sigma$ is optimal implies in particular that moving the $i$-th element from the $j$-th
position where it is mapped by $\sigma$ back to the $i$-th position does not improve the solution. The event $A_{ij}$
implies that among the elements $a_k$ for $k \in [i-\ell, j+\ell]$ at least $m/2 - \ell$ satisfy $q(k, i) = -$. This means
that at least

2£-l> -m + - > -] (m + £) 

2 2 2 2 \2 2 J ^ ' 

of the elements $a_k$ for $k \in [i+1, j+\ell]$ must satisfy $q(k, i) = -$. The probability of this occurring is less
than

l -^^{l/2f \ 
exp I \ =Pi 

using Chernoff bounds. □

As a corollary to Lemma 11 we obtain the following using a simple union bound. For the rest of the
proof all the logs are base 2.

Corollary 12. Let

$m_1 = (-\log \epsilon + 2 \log n)/\log(1/p_1) = O((-\log \epsilon + \log n)/\gamma^2)$,

then $A_{ij}$ does not occur for any $i, j$ with $|i - j| \ge m_1$ with probability $> 1 - \epsilon$.

Next, we formulate a corollary to Lemma 10.

Corollary 13. Suppose that $a_1 < a_2 < \dots < a_n$ is the true ordering. Set $m_2 = 2 m_1$. For each interval
$I = [a_i, \dots, a_j]$ with at least $m_2$ elements consider all the sets $S_I$ which contain the elements from

$I^- = [a_{i+m_2}, \dots, a_{j-m_2}]$,

and are contained in the interval

$I^+ = [a_{i-m_2}, \dots, a_{j+m_2}]$.
Then with probability $> 1 - \epsilon$ all such sets $S_I$ do not have an optimal ordering that has a total deviation
from the true order of more than $c_2 |i - j|$, with

$c_2 = \frac{70}{\gamma^2} = O(\gamma^{-2})$,

a constant.

Proof. There are at most $n^2 \cdot 2^{4 m_2}$ such sets. The probability of each set not satisfying the conclusion
is bounded by Lemma 10 with

$e^{-c_2 m_2 \gamma^2/10} = e^{-7 m_2} < 2^{-7 m_2} = 2^{-m_2} \cdot 2^{-2 m_2} \cdot 2^{-4 m_2} < \epsilon \cdot n^{-2} \cdot 2^{-4 m_2}$.

The last inequality holds because $m_2 > \max(\log n, -\log \epsilon)$. By taking a union bound over all the sets we
obtain the statement of the corollary. □

We are now ready to prove the main result on the pointwise distance between an optimal ordering and 
the original. 

Lemma 14. Assuming that the events from Corollaries 12 and 13 hold, it follows that for each optimal
ordering $\sigma$ and for each $i$, $|i - \sigma(i)| < c_3 \log n$, where

C3 = 500 • = 0(7-'^(- log e/ log n + 1)) 

log n 

is a constant. In particular, this conclusion holds with probability > 1 — 2e. 

Proof. Assume that the events from both corollaries hold, and let $\sigma$ be an optimal ordering. We say that a
position $i$ is good if there is no index $j$ such that $\sigma(j)$ is on the other side of $i$ from $j$ and $|\sigma(j) - j| \ge m_2$.
In other words, $i$ is good if there is no "long" jump over $i$ in $\sigma$. In the case when $i = j$ or $i = \sigma(j)$ for a long
jump, it is not considered good. An index that is not good is bad. An interval $I$ is bad if all of its indices are
bad. Our goal is to show that there are no bad intervals of length $\ge c_3 \log n$. This would prove the lemma,
since if there is an $i$ with $|i - \sigma(i)| \ge c_3 \log n$ then there is a bad interval of length at least $c_3 \log n$.

Assume, for contradiction, that $I = [i, \dots, i+t-1]$ is a bad interval of length $t \ge c_3 \log n$, such that
$i - 1$ and $i + t$ are both good (or lie beyond the endpoints of $[1, \dots, n]$). Denote by $S$ the set of elements
that is mapped to $I$ by $\sigma$. Denote the indices in $S$ in their original order by $i_1 < i_2 < \dots < i_t$, i.e., we have:
$\{\sigma(i_1), \dots, \sigma(i_t)\} = I$.

By the goodness of the endpoints of $I$ we have

$[i + m_2, i + t - 1 - m_2] \subset \{i_1, \dots, i_t\} \subset [i - m_2, i + t - 1 + m_2]$.

Denote the permutation induced by $\sigma$ on $S$ by $\sigma'$, so $\sigma(i_j) < \sigma(i_{j'})$ is equivalent to $\sigma'(j) < \sigma'(j')$. The
permutation $\sigma'$ is optimal, for otherwise it would have been possible to improve $\sigma$ by improving $\sigma'$.
By Corollary 13 and Claim 6 we have

$d(q_{\sigma'}) \le d(\sigma') \le c_2 t$.

In how many switches can the elements of $S$ participate under $\sigma$? They participate in switches with other
elements of $S$ to a total of $d(q_{\sigma'})$. In addition, they participate in switches with elements that are not in $S$.
These elements must originate at the margins of the interval $I$: either in the interval $[i - m_2, i + m_2]$ or the
interval $[i + t - 1 - m_2, i + t - 1 + m_2]$. Thus, each contributes at most $2 m_2$ switches with elements of $S$.
There are at most $2 m_2$ such elements. Hence the total number of switches between elements in $S$ and in $\bar{S}$
is at most $4 m_2^2$. Hence

$\sum_{i \in S} |\sigma(i) - i| \le \sum_{i \in S} \#\{\text{switches } i \text{ participates in}\} \le 4 m_2^2 + 2 d(q_{\sigma'}) \le 4 m_2^2 + 2 c_2 t$. (6)

We assumed that the entire interval $I$ is bad, hence for every position $i$ there is an index $j_i$ such that
$|\sigma(j_i) - j_i| \ge m_2$ and such that $i$ is in the interval $J_i = [j_i, \sigma(j_i)]$ (or the interval $[\sigma(j_i), j_i]$, depending on
the order). Consider all such $J_i$'s. By a Vitali covering lemma argument we can choose a disjoint collection
of them whose total length is at least $|I|/3$. The argument proceeds as follows: Order the intervals in a
decreasing length order (break ties arbitrarily). Go through the list and add a $J_i$ to our collection if it is
disjoint from all the currently selected intervals. We obtain a collection $J_1, \dots, J_k$ of disjoint intervals of
the form $[j_i, \sigma(j_i)]$. Denote the length of the $i$-th interval by $t_i = |j_i - \sigma(j_i)|$. Let $J_i'$ be the "tripling" of
the interval $J_i$: $J_i' = [j_i - t_i, \sigma(j_i) + t_i]$. We claim that the $J_i'$'s cover the entire interval $I$. Let $m$ be a
position on the interval $I$. Then there is an interval of the form $[j, \sigma(j)]$ (or $[\sigma(j), j]$) that covers $m$. Choose
the longest such interval $J' = [j, \sigma(j)]$. If $J'$ has been selected to our collection then we are done. If not,
it means that $J'$ intersects a longer interval $J_i$ that has been selected. This means that $J'$ is covered by the
tripled interval $J_i'$. In particular, $m$ is covered by $J_i'$. We conclude that

$t = \mathrm{length}(I) \le \sum_{i=1}^{k} \mathrm{length}(J_i') = 3 \sum_{i=1}^{k} t_i$.

Thus $\sum_{i=1}^{k} t_i \ge t/3$. This concludes the covering argument.
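
The greedy selection in the covering argument is easy to express in code. The sketch below is illustrative: intervals are assumed to be pairs of integers `(lo, hi)` with inclusive endpoints and length `hi - lo`, and the function names and the sample list `jumps` are made up for the example.

```python
def greedy_disjoint_cover(intervals):
    """Greedy selection from the covering argument.

    intervals: list of (lo, hi) with lo <= hi, integer endpoints, inclusive.
    Returns pairwise disjoint intervals chosen longest-first.
    """
    chosen = []
    for lo, hi in sorted(intervals, key=lambda iv: iv[1] - iv[0], reverse=True):
        if all(hi < c_lo or c_hi < lo for c_lo, c_hi in chosen):
            chosen.append((lo, hi))
    return chosen

def tripled(interval):
    """Extend an interval by its own length on both sides."""
    lo, hi = interval
    t = hi - lo
    return (lo - t, hi + t)

# Hypothetical long jumps that together cover positions 0..20.
jumps = [(0, 6), (4, 9), (8, 10), (11, 15), (14, 20)]
chosen = greedy_disjoint_cover(jumps)
covered = set()
for lo, hi in map(tripled, chosen):
    covered.update(range(lo, hi + 1))
```

Any interval that is skipped intersects an already chosen interval that is at least as long, so it lies inside that interval's tripling; this is why the tripled intervals cover everything the original family covered.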

We now apply Corollary 12 to the intervals $J_i$. We conclude that on an interval $J_i$ the contribution of the
elements of S that are mapped to Ji to the sum of deviations under a is at least if where £i = ^jti. Thus 

^ 

ieS j=l j=l j=l 

- ■m2-t/3>m2- Y^7^ • C3 log n + • mat 

$> m_2 \cdot (4 m_2) + 2 c_2 t = 4 m_2^2 + 2 c_2 t$,

for sufficiently large $n$. The result contradicts (6) above. Hence there are no bad intervals of length $\ge
c_3 \log n$, which completes the proof. □



4 The algorithm 

We are now ready to give an algorithm for computing the optimal ordering with high probability in
polynomial time. Note that Lemma 14 holds for any interval of length $\le n$ (not just length exactly $n$). Set
$\epsilon = n^{-\beta-1}/4$. Given an input, let $S \subset \{a_1, \dots, a_n\}$ be a random set of size $k$. The probability that there is
an optimal ordering $\sigma$ of $S$ and an index $i$ such that $|i - \sigma(i)| \ge c_3 \log n$, where

$c_3 = O(\gamma^{-4}(-\log \epsilon/\log n + 1)) = O(\gamma^{-4}(\beta + 1))$,
is bounded by $2\epsilon$ by Lemma 14. Let

$S_1 \subset S_2 \subset \dots \subset S_n$

be a randomly selected chain of sets such that $|S_k| = k$. Then the probability that an element of an optimal
order of any of the $S_k$'s deviates from its original location by more than $c_3 \log n$ is bounded by $2n\epsilon =
n^{-\beta}/2$. We obtain:

Lemma 15. Let $S_1 \subset \dots \subset S_n$ be a chain of randomly chosen subsets with $|S_k| = k$. Denote by $\sigma_k$ an
optimal ordering on $S_k$. Then with probability $\ge 1 - n^{-\beta}/2$, for each $k$ and for each $i$, $|i - \sigma_k(i)| <
c_3 \log n$, where $c_3 = O(\gamma^{-4}(\beta + 1))$ is a constant.

We are now ready to prove the main result. 

Theorem 16. There is an algorithm that runs in time $n^{c_4}$ where

$c_4 = O(\gamma^{-4}(\beta + 1))$
is a constant that outputs an optimal ordering with probability $\ge 1 - n^{-\beta}$.

Proof. First, we choose a random chain of sets $S_1 \subset \dots \subset S_n$ such that $|S_k| = k$. Then by Lemma 15,
with probability $1 - n^{-\beta}/2$, for each optimal order $\sigma_k$ of $S_k$ and for each $i$, $|i - \sigma_k(i)| < c_3 \log n$. We
will find the orders $\sigma_k$ iteratively until we reach $\sigma_n$, which will be an optimal order for our problem. Denote
$\{a_k\} = S_k - S_{k-1}$. Suppose that we have computed $\sigma_{k-1}$ and we would like to compute $\sigma_k$. We first insert
$a_k$ into a location that is close to its original location as follows. Break $S_k$ into blocks $B_1, B_2, \dots, B_s$ of
length $c_3 \log n$. We claim that with probability $> 1 - n^{-\beta-1}/2$ we can pinpoint the block $a_k$ belongs to within
an error of $\pm 2$, thus locating $a_k$ within $3 c_3 \log n$ of its original location.

Suppose that $a_k$ should belong to block $B_i$. Then by our assumption on $\sigma_{k-1}$, $a_k$ is bigger than any
element in $B_1, \dots, B_{i-2}$ and smaller than any element in $B_{i+2}, \dots, B_s$. By comparing $a_k$ to each element
in the block and taking majority, we see that the probability of having an incorrect comparison result with a
block $B_j$ is bounded by $n^{-\beta-2}/2$. Hence the probability that $a_k$ will not be placed correctly up to an error
of two blocks is bounded by $n^{-\beta-1}/2$ using the union bound.

Hence after inserting $a_k$ we obtain an ordering of $S_k$ in which each element is at most $3 c_3 \log n$ positions
away from its original location. Hence each element is at most $4 c_3 \log n$ positions away from its optimal
location in $\sigma_k$. Thus, by Lemma 7 we can obtain $\sigma_k$ in time $O(n^{24 c_3 + 2})$. The process is then repeated.

The probability of each stage failing is bounded by $n^{-\beta-1}/2$. Hence the probability of the algorithm
failing assuming the chain $S_1 \subset \dots \subset S_n$ satisfies Lemma 15 is bounded by $n^{-\beta}/2$. Thus the algorithm
runs in time $O(n^{24 c_3 + 3})$ and has a failure probability of at most $n^{-\beta}/2 + n^{-\beta}/2 = n^{-\beta}$. □
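
The block-location step of the proof, comparing the new element with every element of a block and taking the majority, can be sketched as follows. The scan for the first block whose majority says "smaller" is one possible way to turn the per-block majorities into a position and is an assumption of this sketch, as are the function names, the comparator interface, and the sample data.

```python
import random

def locate_block(x, blocks, compare):
    """Index of the first block whose majority vote says x is smaller than the block.

    blocks: consecutive blocks of an (approximately) sorted list, smallest first.
    compare(x, y): noisy query, +1 if it reports x > y and -1 otherwise.
    Returns len(blocks) if x beats the majority of every block.
    """
    for idx, block in enumerate(blocks):
        votes = sum(compare(x, y) for y in block)
        if votes < 0:  # majority of the block reported x < y
            return idx
    return len(blocks)

def noisy_compare(gamma, rng):
    """Comparator that answers correctly with probability 1/2 + gamma."""
    def compare(x, y):
        truth = 1 if x > y else -1
        return truth if rng.random() < 0.5 + gamma else -truth
    return compare

items = list(range(0, 400, 2))  # already sorted even numbers
blocks = [items[i:i + 40] for i in range(0, len(items), 40)]
guess = locate_block(171, blocks, noisy_compare(0.3, random.Random(0)))
```

For blocks that lie entirely below or entirely above the new element, the majority is wrong only with probability exponentially small in the block length, which is why blocks of length $c_3 \log n$ suffice for a polynomially small error.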

5 Query Complexity 

Here we briefly sketch the proof of Theorem 5. Recall that the theorem states that although the running time
of the algorithm is a polynomial of $n$ whose degree depends on $p$, the query complexity of a variant of the
algorithm is $O(n \log n)$. Note that there are two types of queries. The first type is comparing elements in
the dynamic programming, while the second is when inserting new elements.

Lemma 17. For all $\beta > 0$, $\gamma < 1/2$ there exists $c(\beta, \gamma) < \infty$ such that the total number of comparisons
performed in the dynamic programming stage of the algorithm is at most $c n \log n$ except with
probability $O(n^{-\beta})$.
Proof. Recall that in the dynamic programming stage, each element is compared with elements that are at
current distance at most $c_0 \log n$ from it where $c_0 = c_0(\beta, \gamma)$.

Consider a random insertion order of the elements $a_1, \dots, a_n$. Let $S_{n/2}$ denote the set of elements
inserted up to the $n/2$-th insertion. Then by standard concentration results it follows that there exists $c_1(c_0, \beta)$
such that for all $1 \le i \le n - c_1 \log n$ it holds that

$|[a_i, a_{i + c_1 \log n}] \cap S_{n/2}| \ge c_0 \log n$, (7)

and for all $c_1 \log n \le i \le n$ it holds that

$|[a_{i - c_1 \log n}, a_i] \cap S_{n/2}| \ge c_0 \log n$ (8)

except with probability at most $n^{-\beta-1}$. Note that when (7) and (8) both hold the number of different queries
used in the dynamic programming while inserting the elements in $\{a_1, \dots, a_n\} \setminus S_{n/2}$ is at most $2 c_1 n \log n$.

Repeating the argument above for the insertions performed from $S_{n/4}$ to $S_{n/2}$, from $S_{n/8}$ to $S_{n/4}$ etc.
we obtain that the total number of queries used is bounded by:

$2 c_1 \log n \,(n + n/2 + \dots + 1) \le 4 c_1 n \log n$,

except with probability $2 n^{-\beta}$. This concludes the proof. □

Next we show that there is an implementation of insertion that requires only $O(\log n)$ comparisons per
insertion.

Lemma 18. For all $\beta > 0$ and $\gamma < 1/2$ there exists a $C(\beta, \gamma) = O(\gamma^{-2}(\beta + 1))$ and $c(\beta, \gamma) = O(\gamma^{-4}(\beta +
1))$ such that except with probability $O(n^{-\beta})$ it is possible to perform the insertion in the proof of Theorem
16 so that each element is inserted using at most $C \log n$ comparisons, $O(\log n)$ time and the element is
placed a distance of at most $c \log n$ from its optimal location.

Proof. Below we assume (as in the proof of Theorem 16) that there exists $c_1(\beta, \gamma) = O(\gamma^{-4}(\beta + 1))$ such
that at all stages of the insertion and for each item, the distance between the location of the item in the
original order and the optimal order is at most $c_1 \log n$. This will result in an error with probability at most
$n^{-\beta}/2$. Let $k = k(\gamma) = O(\gamma^{-2})$ be a constant such that

$P[\mathrm{Bin}(k, 1/2 + \gamma) > k/2] > 1 - 10^{-3}$.

Let $c_2 = O(\beta + 1)$ be chosen so that

$P[\mathrm{Bin}(c_2 \log n, 0.99) < \frac{c_2}{2} \log n + 2 \log_2 n] < n^{-\beta-1}$. (9)

Let $c_3 = k c_2 + 4 c_1$.
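
The constant $k$ is the number of comparisons in one majority test. The sketch below computes, for a given $\gamma$, the smallest odd $k$ whose majority is correct with probability at least $1 - 10^{-3}$, assuming each comparison is correct with probability exactly $1/2 + \gamma$. The restriction to odd $k$ (to avoid ties) and the function names are choices made here for illustration.

```python
from math import comb

def majority_success(k, gamma):
    """Exact P[Bin(k, 1/2 + gamma) > k/2]."""
    p = 0.5 + gamma
    return sum(comb(k, x) * p**x * (1 - p)**(k - x) for x in range(k // 2 + 1, k + 1))

def smallest_odd_k(gamma, failure=1e-3):
    """Smallest odd k with majority_success(k, gamma) >= 1 - failure."""
    k = 1
    while majority_success(k, gamma) < 1 - failure:
        k += 2
    return k

for gamma in (0.4, 0.2, 0.1):
    print(gamma, smallest_odd_k(gamma))
```

The values grow roughly like $\gamma^{-2}$ as $\gamma$ shrinks, in line with $k = O(\gamma^{-2})$.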

We now describe an insertion step. Let $S$ denote a currently optimally sorted set. We will partition
$S$ into consecutive intervals of length between $c_3 \log n$ and $2 c_3 \log n$ denoted $I_1, \dots, I_t$. We will use the
notation $I_i'$ for the sub-interval of $I_i = [s, t]$ defined by $I_i' = [s + 2 c_1 \log n, t - 2 c_1 \log n]$. We say that a
newly inserted element $a_j$ belongs to one of the intervals $I_i$ if one of the two closest elements to it in the
original order belongs to $I_i$. Note that $a_j$ can belong to at most two intervals. An element in $S$ belongs to
$I_i$ iff it is one of the elements in $I_i$. Note furthermore that if $a_j$ belongs to the interval $I_i$ then its optimal
insertion location is determined up to $2(k c_2 + 6 c_1) \log n$. Similarly, if we know it belongs to one of two
intervals then its optimal insertion location is determined up to $4(k c_2 + 6 c_1) \log n$, therefore we can take
$c = 4(k c_2 + 6 c_1) = O(\gamma^{-4}(\beta + 1))$.

Note that by the choice of $c_1$ we may assume that all elements belonging to $I_i$ are smaller than all
elements of $I_j'$ if $i < j$ in the true order. Similarly, all elements belonging to $I_j$ are larger than all elements
of $I_i'$ if $j > i$. We define formally the interval $I_0 = I_0'$ to be an interval of elements that are smaller than all
the items and the interval $I_{t+1} = I_{t+1}'$ to be an interval of elements that is bigger than all items.

We construct a binary search tree on the set $[1, t]$ labeled by sub-intervals of $[1, t]$ such that the root is
labeled by $[1, t]$ and if a node is labeled by an interval $[s_1, s_2]$ with $s_2 - s_1 > 1$ then its two children are
labeled by $[s_1, s']$ and $[s', s_2]$, where $s'$ is chosen so that the length of the two intervals is the same up to
$\pm 1$. Note that the two sub-intervals overlap at $s'$. This branching process terminates at intervals of the form
$[s, s+1]$. Each such node will have a path of descendants of length $c_2 \log n$ all labeled by $[s, s+1]$.

We will use a variant of binary insertion closely related to the algorithm described in Section 3 of Bl. 
The algorithm will run for C2 log n steps starting at the root of the tree. At each step the algorithm will 
proceed from a node of the tree to either one of the two children of the node or to the parent of that node. 

Suppose that the algorithm is at the node labeled by $[s_1, s_2]$ and $s_2 - s_1 > 1$. The algorithm will first
take $k$ elements from $I'_{s_1 - 1}$ that have not been explored before and will check that the current item is greater
than the majority of them. Similarly, it will make a comparison with $k$ elements from $I'_{s_2 + 1}$. If either test
fails it would backtrack to the parent of the current node. Note that if the test fails then it is the case that the
element does not belong to $[s_1, s_2]$ except with probability $10^{-3}$.

Otherwise, let $[s_1, s']$ and $[s', s_2]$ denote the two children of $[s_1, s_2]$. The algorithm will now perform
a majority test against $k$ elements from $I'_{s'}$ according to which it would choose one of the two sub-intervals
$[s_1, s']$ or $[s', s_2]$. Note again that a correct sub-interval is chosen except with probability at most $10^{-3}$ (note
that in this case there may be two "correct" intervals).

In the case where $s_2 = s_1 + 1$ we perform only the first test. If it fails we move to the parent of the node.
If it succeeds, we move to the single child. Again, note that we will move toward the leaf if the interval is
correct with probability at least 0.99. Similarly, we will move away from the leaf if the interval is incorrect
with probability at least 0.99.

Overall, the analysis shows that at each step we move toward a leaf including the correct interval with
probability at least 0.99. From (9) it follows that with probability at least $1 - n^{-\beta-1}$ after $c_2 \log n$ steps the
label of the current node will be $[s, s+1]$ where the inserted element belongs to either $I_s$ or $I_{s+1}$. Thus
the total number of queries is bounded by $3 k c_2 \log n$ and we can take $C = 3 k c_2 = O(\gamma^{-2}(\beta + 1))$. This
concludes the proof. □